Near-Neighbor Query Performance in Search Trees

نویسندگان

Shuanhu Wang

Joseph M. Hellerstein

Ilya Lipkind

چکیده

Near-neighbor search is an increasingly important operation for queries over multimedia, text, and other non-standard datatypes. In large databases, near-neighbor searches must be enhanced by indexed retrieval for e ciency. In this paper, we present a detailed analysis of three proposals for near-neighbor search: one based on the R-tree, and two which motivated the invention of new trees, namely the SS-tree and SR-tree. We nd that while the new trees do improve performance, the reason for their improvement comes mostly from a new Penalty metric, and not from a variety of other details in their implementation. Our analysis was done using a Generalized Search Tree, which both allowed us to easily do a fair comparison, and also provided the framework for a clearer analysis of the issues at hand.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Probabilistic cost model for nearest neighbor search in image retrieval

We present a probabilistic cost model to analyze the performance of the kd-tree for nearest neighbor search in the context of content-based image retrieval. Our cost model measures the expected number of kd-tree nodes traversed during the search query. We show that our cost model has high correlations with both the observed number of traversed nodes and the runtime performance of search queries...

متن کامل

Which Space Partitioning Tree to Use for Search?

We consider the task of nearest-neighbor search with the class of binary-spacepartitioning trees, which includes kd-trees, principal axis trees and random projection trees, and try to rigorously answer the question “which tree to use for nearestneighbor search?” To this end, we present the theoretical results which imply that trees with better vector quantization performance have better search ...

متن کامل

Distance Based Indexing for String Proximity Search

In many database applications involving string data, it is common to have near neighbor queries (asking for strings that are similar to a query string) or nearest neighbor queries (asking for strings that are most similar to a query string). The similarity between strings is defined in terms of a distance function determined by the application domain. The most popular string distance measures a...

متن کامل

Nearest Neighbor Search using Kd-trees

We suggest a simple modification to the kd-tree search algorithm for nearest neighbor search resulting in an improved performance. The Kd-tree data structure seems to work well in finding nearest neighbors in low dimensions but its performance degrades even if the number of dimensions increases to more than three. Since the exact nearest neighbor search problem suffers from the curse of dimensi...

متن کامل

Orthogonal Range Searching in Moderate Dimensions: k-d Trees and Range Trees Strike Back

We revisit the orthogonal range searching problem and the exact `∞ nearest neighbor searching problem for a static set of n points when the dimension d is moderately large. We give the first data structure with near linear space that achieves truly sublinear query time when the dimension is any constant multiple of log n. Specifically, the preprocessing time and space are O(n) for any constant ...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1998

Near-Neighbor Query Performance in Search Trees

نویسندگان

چکیده

منابع مشابه

Probabilistic cost model for nearest neighbor search in image retrieval

Which Space Partitioning Tree to Use for Search?

Distance Based Indexing for String Proximity Search

Nearest Neighbor Search using Kd-trees

Orthogonal Range Searching in Moderate Dimensions: k-d Trees and Range Trees Strike Back

عنوان ژورنال:

اشتراک گذاری